Fast Practical Multi-Pattern Matching

Authors

  • Maxime Crochemore
  • Artur Czumaj
  • Leszek Gasieniec
  • Thierry Lecroq
  • Wojciech Plandowski
  • Wojciech Rytter
Abstract

The multi-pattern matching problem consists in finding all occurrences of the patterns from a finite set X in a given text T of length n. We present a new and simple algorithm combining the ideas of the Aho–Corasick algorithm and the directed acyclic word graphs. The algorithm has time complexity which is linear in the worst case (it makes at most 2n symbol comparisons) and has good average-case time complexity assuming the shortest pattern is sufficiently long. Denote the length of the shortest pattern by m, and the total length of all patterns by M. Assume that M is polynomial with respect to m, the alphabet contains at least 2 symbols and the text (in which the pattern is to be found) is random, for each position each letter occurs independently with the same probability. Then the average number of comparisons is O((n/m) · log m), which matches the lower bound of the problem. For sufficiently large values of m the algorithm has a good behavior in practice. © 1999 Elsevier Science B.V. All rights reserved.

Similar resources

Performance Evaluation of Local Detectors in the Presence of Noise for Multi-Sensor Remote Sensing Image Matching

Automatic, efficient, accurate, and stable image matching is one of the most critical issues in remote sensing, photogrammetry, and machine vision. In recent decades, various algorithms have been proposed based on the feature-based framework, which concentrates on detecting and describing local features. Understanding the characteristics of different matching algorithms in various applications ...

SigMatch: Fast and Scalable Multi-Pattern Matching

Multi-pattern matching involves matching a data item against a large database of “signature” patterns. Existing algorithms for multipattern matching do not scale well as the size of the signature database increases. In this paper, we present sigMatch – a fast, versatile, and scalable technique for multi-pattern signature matching. At its heart, sigMatch organizes the signature database into a (...

A Novel Fast Negative Selection Algorithm Enhanced by State Graphs

The Negative Selection Algorithm is widely applied in Artificial Immune Systems, but it is not fast enough when a large amount of data needs to be processed. Multi-pattern matching algorithms are able to locate all occurrences of multiple patterns in an input string in a single scan. Inspired by the multi-pattern matching algorithm proposed by Aho and Corasick in 1975 [1], a novel fast negative ...

A general compression algorithm that supports fast searching

The task of compressed pattern matching [2] is to report all the occurrences of a given pattern P in a text T available in compressed form. Certain compression algorithms allow for searching without prior decoding, which may be practical, especially if the search is faster than in the non-compressed representation. Most of the known schemes, however, either assume a text formed into words, or are...

Fast and Robust 3D Correspondence Matching and Its Application to Volume Registration

This paper presents a fast and accurate volume correspondence matching method using 3D Phase-Only Correlation (POC). The proposed method employs (i) a coarse-to-fine strategy using multi-scale volume pyramids for correspondence search and (ii) high-accuracy POC-based local block matching for finding dense volume correspondence with subvoxel displacement accuracy. This paper also proposes its GP...

Journal title:
  • Inf. Process. Lett.

Volume 71  Issue 

Pages  -

Publication date 1999